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We are concerned with the numerical resolution of backward 
stochastic differential equations. We propose a new numerical scheme 
based on iterative regressions on function bases, which coefficients are 
evaluated using Monte Carlo simulations. A full convergence analy- 
sis is derived. Numerical experiments about finance are included, in 
particular, concerning option pricing with differential interest rates. 

1. Introduction. In this paper we are interested in numerically approxi- 
mating the solution of a decoupled forward-backward stochastic differential 
equation (FBSDE) 

(1) St = So+ f\{s,Ss)ds+ f a{s,Ss)dWs, 

Jo Jo 

(2) Yt = $(S) + fis, Ss, Ys, Zs) ds - dWs. 

In this representation, S = {St ■0<t<T) is the d-dimensional forward com- 
ponent and Y = (Yt :0 <t <T) the one-dimensional backward one (the ex- 
tension of our results to multidimensional backward equations is straight- 
forward). Here, TV is a q-dimensional Brownian motion defined on a filtered 
probability space {0,,T,¥, {Tt)o<t<T)-, where {Tt)t is the augmented natural 
filtration of W. The driver /(•,-,•,•) and the terminal condition $(•) are, 
respectively, a deterministic function and a deterministic functional of the 
process S. The assumptions (H1)-(H3) below ensure the existence and the 
uniqueness of a solution (S,Y,Z) to such equation (l)-(2). 
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Applications of BSDEs. Such equations, first studied by Pardoux and 
Peng [26] in a general form, are important tools in mathematical finance. 
We mention some applications and refer the reader to [10, 12] for numerous 
references. In a complete market, for the usual valuation of a contingent 
claim with payoff '^(S), Y is the value of the replicating portfolio and Z is 
related to the hedging strategy. In that case, the driver / is linear w.r.t. Y 
and Z. Some market imperfections can also be incorporated, such as higher 
interest rate for borrowing [4]: then, the driver is only Lipschitz continuous 
w.r.t. Y and Z . Related numerical experiments are developed in Section 6. 
In incomplete markets, the Follmer-Schweizer strategy [14] is given by the 
solution of a BSDE. When trading constraints on some assets are imposed, 
the super-replication price [13] is obtained as the limit of nonlinear BSDEs. 
Connections with recursive utilities of Duffie and Epstein [11] are also avail- 
able. Peng has introduced the notion of ^-expectation (here g is the driver) 
as a nonlinear pricing rule [28]. Recently he has shown [27] the deep connec- 
tion between BSDEs and dynamic risk measures, proving that any dynamic 
risk measure {£t)o<t<T (satisfying some axiomatic conditions) is necessarily 
associated to a BSDE (Yt)o<t<T (the converse being known for years). The 
least we can say is that BSDEs are now inevitable tools in mathematical 
finance. Another indirect application may concern variance reduction tech- 
niques for the Monte Carlo computations of expectations, say E(<I>) taking 
/ = 0. Indeed, Zg dWs is the so-called martingale control variate (see [24], 
for instance). Finally, for applications to semi-linear PDEs, we refer to [25], 
among others. 

The mathematical analysis of BSDE is now well understood (see [23] for 
recent references) and its numerical resolution has made recent progresses. 
However, even if several numerical methods have been proposed, they suffer 
of a high complexity in terms of computational time or are very costly in 
terms of computer memory. Thus, their uses in practice on real problems are 
difficult. Hence, it is still topical to devise more efficient algorithms. This 
article contributes in this direction by developing a simple approach, based 
on Monte Carlo regression on function bases. It is in the vein of the general 
regression approach of Bouchard and Touzi [6] , but here it is actually much 
simpler because only one set of paths is used to evaluate all the regression 
operators. Consequently, the numerical implementation is easier and more 
efficient. In addition, we provide a full mathematical analysis of the influence 
of the parameters of the method. 

Numerical methods for BSDEs. In the past decade, there have been sev- 
eral attempts to provide approximation schemes for BSDEs. First, Ma, Prot- 
ter and Yong [22] propose the four step scheme to solve general FBSDEs, 
which requires the numerical resolution of a quasilinear parabolic PDF. In 
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[2], Bally presents a time discretization scheme based on a Poisson net: 
this trick avoids him using the unknown regularity of Z and enables him 
to derive a rate of convergence w.r.t. the intensity of the Poisson process. 
However, extra computations of very high-dimensional integrals are needed 
and this is not handled in [2]. In a recent work [29], Zhang proves some L2- 
regularity on which allows the use of a regular deterministic time mesh. 
Under an assumption of constructible Junctionals for <l> (which essentially 
means that the system can be made Markovian, by adding d' extra state 
variables), its approximation scheme is less consuming in terms of high- 
dimensional integrals. If for each of the d + d! state variables, one uses M 
points to compute the integrals, the complexity is about M'^^'^ per time 
step, for a global error of order say (actually, an analysis of the global 
accuracy is not provided in [29]). This approach is somewhat related to the 
quantization method of Bally and Pages [3] , which is an optimal space dis- 
cretization of the underlying dynamic programming equation (see also the 
former work by Chevance [8], where the driver does not depend on Z). We 
should also mention the works by Ma, Protter, San Martin and Soledad [21] 
and Briand, Delyon and Memin [7] , where the Brownian motion is replaced 
by a scaled random walk. Weak convergence results are given, without rates 
of approximation. The complexity becomes very large in multidimensional 
problems, like for finite differences schemes for PDEs. Recently, in the case 
of path-independent terminal conditions $(S) = (/>(5t), Bouchard and Touzi 
[6] propose a Monte Carlo approach which may be more suitable for high- 
dimensional problems. They follow the approach by Zhang [29] by approx- 
imating (l)-(2) by a discrete time FBSDE with time steps [see (5)-(6) 
below], with an L2-error of order N~^^^. Instead of computing the condi- 
tional expectations which appear at each discretization time by discretizing 
the space of each state variable, the authors use a general regression opera- 
tor, which can be derived, for instance, from kernel estimators or from the 
Malliavin calculus integration by parts formulas. The regression operator at 
a discretization time is assumed to be built independently of the underlying 
process, and independently of the regression operators at the other times. 
For the Malliavin calculus approach, for example, this means that one needs 
to simulate at each discrete time, M copies of the approximation of (1), 
which is very costly. The algorithm that we propose in this paper requires 
only one set of paths to approximate all the regression operators at each dis- 
cretization time at once. Since the regression operators are now correlated, 
the mathematical analysis is much more involved. 

The regression operator we use in the sequel results from the L2-projection 
on a finite basis of functions, which leads in practice to solve a standard 
least squares problem. This approach is not new in numerical methods for 
financial engineering, since it has been developed by Longstaff and Schwartz 
[20] for the pricing of Bermuda options. See also [5] for the option pricing 
using simulations under the objective probability. 
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Organization of the paper. In Section 2 we set the framework of our 
study, define some notation used throughout the paper and describe our 
algorithm based on the approximation of conditional expectations by a pro- 
jection on a finite basis of functions. We also provide some remarks related 
to models in finance. 

The next three sections are devoted to analyzing the influence of the 
parameters of this scheme on the evaluation of Y and Z. Note that approx- 
imation results on Z were not previously considered in [6]. In Section 3 we 
provide an estimation of the time discretization error: this essentially fol- 
lows from the results by Zhang [29]. Then, the impact of the function bases 
and the number of simulated paths is separately discussed in Section 4 and 
in Section 5, which is the major contribution of our work. Since this least 
squares approach is also popular to price Bermuda options [20], it is crucial 
to accurately estimate the propagation of errors in this type of numerical 
method, that is, to ensure that it is not explosive when the exercise fre- 
quency shrinks to 0. L2-estimates and a central limit theorem (see also [9] 
for Bermuda options) are proved. 

In Section 6 explicit choices of function bases are given, together with nu- 
merical examples relative to the pricing of vanilla options and Asian options 
with differential interest rates. 

2. Assumptions, notation and the numerical scheme. 

2.1. Standing assumptions. Throughout the paper we assume that the 
following hypotheses are fulfilled: 

(HI) The functions (t, x) ^ b{t,x) and {t,x) a{t,x) are uniformly Lips- 

chitz continuous w.r.t. {t,x) G [0,T] x R^. 
(H2) The driver / satisfies the following continuity estimate: 

\f{t2,X2,y2,Z2) - f{ti,Xi,yi,Zi)\ 

< Cf{\t2 - ti\^/^ + \x2 - xi\ + |y2 - yil + \z2 - Zl\) 

for any (ti, xi, yi, zi), (ts, xs, 2/2, ^2) S [0, T] x x M x M". 
(H3) The terminal condition $ satisfies the functional Lipschitz condition, 
that is, for any continuous functions and s^, one has 

|$(si)-c^(s2)|<C sup \sl-s^,\. 

te[o,T] 

These assumptions (H1)-(H3) are sufficient to ensure the existence and 
uniqueness of a triplet (S,Y,Z) solution to (1)~(2) (see [23] and references 
therein). In addition, the assumption (H3) allows a large class of terminal 
conditions (see examples in Section 2.4). 
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To approximate the forward component (1), we use a standard Euler 
scheme with time step h (say smaller than 1), associated to equidistant 
discretization times {tk = kh = kT /N)Q<k<N ■ This approximation is defined 
by Sq = So and 

(3) Sf[^^=Sll + b{tk,S^Jh + a{tk,Sll){Wt^^^-Wt,). 

The terminal condition ^*(S) is approximated by ^^{P^), where is 
a deterministic function and {Pt^)o<k<N is a Markov chain, whose first 
components are given by those of {S^)o<k<N- In other words, we even- 
tually add extra state variables to make Markovian the implicit dynam- 
ics of the terminal condition. We also assume that P^^ is J^t^. -measurable 
and that E[$'^(P^^)]^ < oo. Of course, this approximation strongly depends 
on the terminal condition type and its impact is measured by the error 
E|$(S) - <^>^(P/J^)p (see Theorem 1 later). Examples of function are 
given in Section 2.4. 

Another hypothesis is required to prove that a certain discrete time BSDE 
{Yt^)k can be represented as a Lipschitz continuous function y^{tk,-) of 
Pf^ (see Proposition 3 later). This property is mainly used in Section 6 on 
numerical experiments to derive relevant regression approximations. 

(H4) The function <I>^(-) is Lipschitz continuous (uniformly in A^) and 
sup^ |$^(0)| < oo. In addition, Ej^''^-^ - ^''^-^'p + E|<J^;;^ - 

P^^'^+f? < C\x - x'\^ uniformly in and A. 

Here, {P^^'^^'^)k stands for the Markov chain {P^)k starting at P^^ = x. 

Moreover, since we deal with the flow properties of {P^^)k, we use the stan- 
dard representation of this Markov chain as a random iterative sequence of 
the form P^^ = Pk^{Uk,Pt^_^), where {F^)k are measurable functions and 
{Uk)k are i.i.d. random variables. 

2.2. Notation. 

Projection on function bases. 

• The L2(r2,P) projection of the random variable [/ on a finite family (f) = 
[(^1, . . . , (considered as a random column vector) is denoted by V^{U). 
We set TZ^{U) = U — V(f,{U) for the projection error. 

• At each time tk, to approximate, respectively, Yt,, and Zi^t^ {^i,tk is the Zth 
component of Ztf,, !</<(?), we will use, respectively, finite-dimensional 
function bases Po,k{Pt^ ) and pi^k{Pi^ ) (1 < ^ < <?), which may be also writ- 
ten po,fc and pi^k (1 < ^ < ?) to simplify. In the following, for convenience, 
both (pi,fc(-)) and {pi,k{Pt^)) are indifferently called function basis. Ex- 
plicit examples are given in Section 6. The projection coefficients will be 
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denoted Qo,fc, ai,fc, • • • , ag,fc (viewed as column vectors). We assume that 
< oo (0 < / < and w.l.o.g. that K{pi^kPt k) invertible, which en- 
sures the uniqueness of the coefficients of the projection j, {0 <l <q)- 

• To simplify, we write fk{ao,k, aq^k) or fk{ak) for f{tk, Sf^, ao,fc -po.fc, • • • , aq,k ■ 
Pq,k) [S^ is the Euler approximation of St^. , see (3)]. 

• For convenience, we write Efc(-) = E{-\J^tk)- We put AWk = Wt,,^-^ — Wt^ 
(and AWi^k component-wise) and define Vk the (column) vector given by 

M =iP0,k^Pl,k^^---,Pg,k^)- 

• For a vector x, \x\ stands, as usual, for its Euclidean norm. The relative 
dimension is still implicit. For an integer M and x S M*'^, we put = 
Jt Sm=i l^mP- For a set of projection coefficients a = (ao, . • . , aq), we set 
|a| = maxo<z<q |a/| (the dimensions of the ai may be different). For the 
set of basis functions at a fixed time tk , \pk \ is defined analogously. 

• For a real symmetric matrix A, \\A\\ and ||^||f are, respectively, the max- 
imum of the absolute value of its eigenvalues and its Frobenius norm 
(defined by = a?j)- 

We refer to Section 6 for explicit choices of function bases, but to fix ideas, a 
possible choice could be to define, for each time t^, grids {xj f,:l <i < n)Q<^i<^q 
and define pi,ki') as the basis of indicator functions of the open Voronoi 
partition [17] associated to (3;} . : 1 < i < n), that is, pi fc(-) = (!(-.» (•))i<i<n) 

' ' l,k 

where C/^ = {x : |x — ^| < |x — x| ^| , Vj 7^ i}. 

Simulations. In the following, M independent simulations of (P^^ )o<k<N, 
{AWk)o<k<N^i will be used. We denote them {{Pt^'"')o<k<N)i<m<M , {{AWJp)o<k<N~i)i<m<M- 



The values of basis functions along these simulations are denoted (p[ 



,m 
k 



( j-,N,m\\ 

PlK-^tk ))0<l<q,0<k<N-l,l<m<M- 



• Analogously to fk{ao,k,---,aq,k) or fk{ak), we denote /^'(ao,fc, • • • , Og.fc) 
or/-(afc)for/(tfc,5,^ 

We define the following: 



or /r(afc) for /(t^, 5^^'"", ao,fc ■Po,k^---' ' 



AM/'" AM/™- 

the (column) vector v",^ by [v^]* = (p-,p--^, . . . ,<,*^); 

the matrix V,'' = ^ E^'=i ^r[<]*; 

the matrix P^^ = ^ E^iiC^brfcl* (« < ^ < l)- 



Truncations. To ensure the stability of the algorithm, we use thresh- 
old techniques, which are based on the following notation: 

• In Proposition 2 below, based on BSDEs' a priori estimates, we explicitly 
build some M- valued functions (/offc)o<z<g,o<fc<Af-i bounded from below 
by 1. We set (P,^) = [p^^P,^),'. . . ,pj,(<)]*. 
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• Associated to these estimates, we define (random) truncation functions 

where ^ : M R is a C^-function, such that ^(x) = x for \x\ < 3/2, |^|oo < 2 
and l^'loo < 1- 

In the next computations, C denotes a generic constant that may change 
from hne to hne. It is still uniform in the parameters of our scheme. 



2.3. The numerical scheme. We are now in a position to define the 
simulation-based approximations of the BSDE (l)-(2). The statements of 
approximation results and their proofs are postponed to Sections 3, 4 and 5. 

Our procedure combines a backward in time evaluation (from time = T 
to time to = 0), a fixed point argument (using i = 1, . . . , I Picard iterations), 
least squares problems on M simulated paths (using some function bases). 

Initialization. The algorithm is initialized with 1^^'*'^'*^ = ^^{P^) (in- 
dependently of i and /). Then, the solution (Yt,^, Zi^t^, ■ ■ ■ , Zg^t^) at a given 
time tfc is represented via some projection coefficients {a]'f,' )o<i<q by 

^^N,i,I,M I i.I.M X rTrzNAJM -AT / rr i,I,M n 

\ =Po,fc(«o,fc ■Po,k), v/i^^j^ = pi^kiVhai,^ -pi^k) 

{pQf. and pf^i^ are the truncations introduced before). We now detail how the 
coefficients are computed using independent realizations {{P^'"^)o<k<N)i<m<M , 
{iAW^)o<k<N^i) 



l<m<M- 



Backward in time iteration at time tk < T. Assume that an approxi- 

, ■ ^^N, I.I.M -TV (- IJ,M \ ■ -u -M. J J J. iirN,I,I,M,m 

mation y^^^^' ■= Po,k+ii<^o,k+i ' Po,k+i) is built, and denote Fj^^; = 
Po'k+ii^Ok+i ' Po^k+i) realization along the mth simulation. 

— > For the initialization i = of Picard iterations, set y/^'^'^'^^ = and 

Z^^ =0, that IS, y = (0 < / < (7) . 

— > For i = 1, . . . ,/, the coefficients cx^'^^ = ("/'fc'^)o</<g are iteratively ob- 
tained as the argmin in (ao, . . . ,aq) of the quantity 

(4) ^ - ao ■ P^,k + hfrial'''''') - E «r AW^,™ 

m=l \ 1=1 

If the above least squares problem has multiple solutions (i.e., the empirical 
regression matrix is not invertible, which occurs with small probability when 
M becomes large), we may choose, for instance, the (unique) solution of 
minimal norm. Actually, this choice is arbitrary and has no incidence on the 
further analysis. 
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The convergence parameters of this scheme are the time step h (/i — > 0), 
the function bases, the number of simulations M (M — > +00). This is fully 
analyzed in the following sections, with three main steps: time discretization 
of the BSDE, projections on bases functions in L2(ri,P), empirical projec- 
tions using simulated paths. An estimate of the global error directly follows 
from the combination of Theorems 1, 2 and 3. We will also see that it is 
enough to have / = 3 Picard iterations (see Theorem 3) . 

The intuition behind the above sequence of least squares problems (4) 
is actually simple. It aims at mimicking what can be ideally done with an 
infinite number of simulations, Picard iterations and bases functions, that 
is, 

(y,f , Z,^) = arg inf E(y,f^^ -Y + hf{tk, 5,^, Y, Z) - Z^Wk)\ 
(y,z)eL2(.Ftj 

where, as usual, 'L2{J^tk) stands for the square integrable and .Fj^. -measurable, 
possibly multidimensional, random variables. This ideal case is an appoxi- 
mation of the BSDE (2) which writes 

Yt,^, + f'*' f{s, Ss,Ys,Zs) ds = Yt, + f"^' Zs dWs 

over the time interval {Yt^)k will be interpreted as a discrete time 

BSDE (see Theorem 1). 

2.4. Remarks for models in finance. Here, we give examples of drivers 
/ and terminal conditions $(S) in the case of option pricing with different 
interest rates [4]: R for borrowing and r for lending with R>r. Assume 
for simplicity that there is only one underlying risky asset (d = 1) whose 
dynamics is given by the Black-Scholes model with drift fi and volatility a 
{q=l): dSt = St{fidt + adWt). 

• Driver: If we set f{t,x,y,z) = —{yr + z9 — (y — ^)~{R — r)}, where 6 = 

Yt is the value at time t of the self- financing portfolio replicating the 
payoff $(S) [12]. In the case of equal interest rates R = r, the driver is 
linear and we obtain the usual risk-neutral valuation rule. 

• Terminal conditions: A large class of exotic payoffs satisfies the functional 
Lipschitz condition (H3). 

- Vanilla payoff: cI>(S) = ^{St). Set Pf^ = 5,^ and $^(Pi^) = HPO- 
Under (H3), it gives E|$^(P/^) - $(S)|2 < Ch. 

- Asian payoff: ^S) = cPiSrJ^Stdt). Set P,^ = (5^^, /lE-Jo ^tf ) and 

= (P{P^^). For usual functions (p, the L2 -error is of order 1/2 
w.r.t. h. More accurate approximations of the average of S could be 
incorporated [18]. 
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- Lookback payoff: $(S) = 0(S't, min^gp^T] -S**, maxjgp^T] St). Set (P^^) 
(t>{Pti) with = {S^^,mmi<kS^ ,ma,Xi<kS^). In general, this in- 
duces an L2-error of mag nitude y^hlog{l/h) [29]. The rate Vh can 
be achieved by considering the exact extrema of the continuous Euler 
scheme [1]. 

Note also that (H4) is satisfied on these payoffs. 

We also mention that the price process {St)t is usually positive coordinate- 
wise, but its Euler scheme [defined in (3)] does not enjoy this feature. This 
may be an undesirable property, which can be avoided by considering the 
Euler scheme on the log-price. With this modification, the analysis below is 
unchanged and we refer to [15] for details. 

3. Approximation results: step 1. We first consider a time approxima- 
tion of equations (1) and (2). The forward component is approximated us- 
ing the Euler scheme (3) and the backward component (2) is evaluated in 
a backward manner. First, we set YJ^ = ^^{Pj^). Then, {Yj-^ , -^t^)o<fe<Ar-i 
are defined by 



(5) z!^,^ = Ie,{Y,IAW,,), 

(6) Y,^ = Efc(y,f^^ ) + hf{t,,sfl, , zl 



Using, in particular, the inequality \ZilJ < Efc(Y'j^^)2, it is easy to 

see by a recursive argument that YJ^ and Z^ belong to Ij2{J^tk)- It is also 
equivalent to assert that they minimize the quantity 

(7) E(y,f^^ - y + hf{tk,sll,Y, z) - zAw^f 



over \i2{J'tk) random variables (y, Z). Note that Y(^ is well defined in (6), 
because the mapping Y i— > Efc(yj^ J -|- hf{tk,S^,Y, Z^^) is a contraction in 
L2(-^tfc)) for h small enough. The following result provides an estimate of 
the error induced by this first step. 



Theorem 1. Assume (H1)-(H3). For h small enough, we have 

N-l f^^^ 

max E|yj, - y^l^ + y / E|Zt - [^ dt 

0<k<N ' ' Jt, ' * ' 



<C((l + |5o|2)/i + E|cI>(S)-<I>^(P,^)|2) 



Proof. From [29], we know that the key point is the L^-regularity of Z. 
Here, under (H1)-(H3), Z is cadlag (see Remark 2.6.ii in [29]). Thus, The- 
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orem 3.1 in [29] states that 

^e/ \Zt-Ztfdt<C{l + \So\^)h. 

k=0 

With this estimate, the proof of Theorem 1 is standard (see, e.g., the proof 
of Theorem 5.3 in [29]) and we omit details. □ 

Owing to the Markov chain {Pt^)o<k<N ^ the independent increments 
{AWk)o<k<N-~i and (5)-(6), we easily get the following result. 

Proposition 1. Assume (H1)-(H3). Fork small enough, we have 

(8) y,f = yj^iPfj, Zl^,^ = zf^,{Pil) forO<k<Nand l<l<q, 
where {yj^ {■))k and (z/^(-))a;,/ are measurable functions. 

It will be established in Section 6 that they are Lipschitz continuous under 
the extra assumption (H4). 

4. Approximation results: step 2. Here, the conditional expectations 
which appear in the definitions (5)-(6) of Y^f and Z/]^^ (1 < / < g) are re- 
placed by a L2(J^,P) projection on the function bases po,fc andp^^^ (1 < ^ < q)- 
A numerical difficulty still remains in the approximation of Y/^ in (6), which 
is usually obtained as a fixed point. To circumvent this problem, we propose 
a solution combining the projection on the function basis and / Picard iter- 
ations. The integer J is a fixed parameter of our scheme (the analysis below 
shows that the value I = 3 is relevant). 

Definition 1. We denote by y/^'^'^ the approximation of YJ^ , where 
i Picard iterations with projections have been performed at time tk and / 
Picard iterations with projections at arry time after t^. Analogous notation 
stands for . We associate to Y^^ and Z^ their respective projec- 

tion coefficients and aj'^, on the function bases po,fc and pi^k ^ q)- 

We now turn to a precise definition of the above quantities. We set 
-^7V,i,/ _ ^N^pN^-^^ independently of i and /. Assume that ^^('^ is obtained 

and let us define y^^'*'^, Z^f^'^ for z = 0, . . . , I. We begin with Y^'^'^ = and 

Z^'^'^ = 0, corresponding to aj'l = (0 < / < g). By analogy with (7), we 

set a^'^ = (o; fc)o<«<g as the argmin in (ao, . . . , aq) of the quantity 

(9) E (y/;;i['' - ao ■ po,k + hfkial-^'') " E " m^^i^k^ . 
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Iterating with i = 1,...,/, at the end we get {oi\'l)o<i<q, thus, y/^'^'^ = 

^ok ' Po,k and Z^^'^'^ = a\'l ■ pi^k (1 < ^ < The least squares problem (9) 
can be formulated in different ways but this one is more convenient to get 
an intuition on (4) . The error induced by this second step is analyzed by the 
following result. 

Theorem 2. Assume (H1)-(H3). For h small enough, we have 



max E\Yf'''' - Y,^\' + V E\Z^'''' - ^ 

~ ~ fe=0 



fc=0 fc=0 1=1 



The above result shows how projection errors cumulate along the back- 
ward iteration. The key point is to note that they only sum up, with a 
factor C which does not explode as N ^ oo. These estimates improve those 
of Theorem 4.1 in [6] for two reasons. First, error estimates on are pro- 
vided here. Second, in the cited theorem, the error is analyzed in terms of 

^I^Po fe(-^tf^'^'^)P ^l^p; fe(^/^)f '^)P s^y- hence, the influence of function 
bases is still questionable, since it is hidden in the projection residuals TZp^. 

and also in the random variables yI^'^'^ and Z^,'^'^ . Our estimates are rel- 

'^k l,tk 

evant to directly analyze the influence of function bases (see Section 6 for 
explicit computations). This feature is crucial in our opinion. Regarding the 
influence of /, it is enough here to have / = 2 to get an error of the same 
order as in Theorem 1. At the third step, / = 3 is needed. 

Proof of Theorem 2. For convenience, we denote (Sq) = 1 + 
jiSoP -|-E|$^(Pi^)p. In the following computations, we repeatedly use three 
standard inequalities: 

1. The contraction property of the L2-projection operator: for any random 
variable X e L2, we have E|-Pp, ,^(X)|2 < E|Xp. 

2. The Young inequality: V7 > 0, V (a, 6) G M^, (a + hf < (1 + 7/1)0^ + + 

3. The discrete Gronwall lemma: for any nonnegative sequences {ak)o<k<N , 
ih)o<k<N and (cfc)o<fc<Ar satisfying ak-i + Ck-i < {l + 'yh)ak + bk-i (with 
7 > 0) , we have Ok + E^=k^ Ci < e^(^-*'=) [ajv + Y.iJk ■ Most of the time, 
it will be used with a = 0. 
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Because AWk is centered and independent of {pi,k)o<i<q-, it is straightforward 
to see that the solution of the least squares problem (9) is given, for i > 1, 
by 

(10) = ^P,,,(<;('^ATy,,), 

(11) Y^/^^ = v,^, (y,f;('^ + hm. si. Yll^-'\ 

The proof of Theorem 2 may be divided in several steps. 

Stej) 1: a (tight) 'preliminary upper bound for 'ElZ^^'^l'^ . First note that 
Z^.'^'^ is constant for i>l. Moreover, the Cauchy-Schwarz inequality yields 

|E,(y,f;^'^AT^,,,)|2 = |E,(Kf;;'^ - Efc(y,f;('^)]ATy,,,)|2 < MEfcKf;!'? - 

[Efc(y^^^'^)]^). Since {pi^k)i is .T^tfc -measurable and owing to the contraction 
of the projection operator, it follows that 

(12) , 

<-(EKf;('^]^-E[Efc(y,f;^'^)]2). 

As it may be seen in the computations below, the term E[Eyfc(Y^^^'^)]^ in 

(12) plays a crucial role to make further estimates not explosive w.r.t. h. 

Step 2: L2 bounds for 5^^'*'^ and \/hZll'^ . Actually, it is an easy ex- 
ercise to check that the random variables and \/liZll'^ are square 
integrable. We aim at proving that uniform L2 bounds w.r.t. k are avail- 
able. Denote ' :y S L2(.FiJ ^ Pp„,,(y,f;('V /i/(tfc,5i^,y,<''-''')) G 

UiJ't,). Clearly, E|xf''(y2) - Xk^Yi)? < (CfhfnY^ - Yi\^ where Cf is 
the Lipschitz constant of /. Consequently, for h small enough, the appli- 
cation Xk'^ is contracting and has a unique fixed point Yf^'°°'^ E Ij2{J-'t^.) 
(remind that Z^^^'^ does not depend on i > 1). One has 

(13) y^f '"^'^ = V,,,, {Y,^;Y + hf{tk,sl Y,f^'\ zl/^% 

(14) Ejy^f - y,f '^'Y < (C//i)2*E|yjf 

since y/^'^'^ = 0. Thus, Young's inequality yields, for i > 1, 

E|y,f '^'Y < (^1 + i)E|y,f'°°'^ - y^f^'Y + (i + /i)E|y,f'°°'^'2 

(15) 

< (l + C7/i)E|yif'°°'^|l 
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The above inequality is also true for i = because Yt^'^'^ = 0. We now 
estimate E|l^^'°°'^p from the identity (13). Combining Young's inequality 

(with 7 to be chosen later), the identity T'po.fc (^tf+i'^) = ^Po.fc (I^fcKf+i'^])' 
the contraction of Vp^^ j. and the Lipschitz property of /, we get 

E|y,f'°°'^|2<(i + 7ME|E,[y,f;f]|2 

(16) 



+ Ch(h+^] [E/|(0, . . . , 0) + E|y,f + E|<'^'^|2] 



Bringing together terms E|l^^'°°'^p, then using (12) and the easy upper 
bound E/|(0, . . . , 0) < C(l + |5oP), it readily follows that 

I ife I - i_ch{h + l/j) ' ^ ^' 

, c(/. + i/7) (E|y,f:('^|^-E|E.Kf:('^]r), 



l-C/i(/i + l/7)' ' ^'^+1 ' ' ^'^+1 
provided that h is small enough. Take 7 = C to get 

E|y,f < Ch[l + \So\'] + (1 + Ch)E\Y,^;[^'f + C/.E|E,Kf;^'- 



(18) 



< Ch[l + |5op] + (1 + 2Ch)E\Yt^l{'^\^ 



with a new constant C. Plugging this estimate into (15) with i = I, we 
get Ely^f'^'^P < Ch[l + l^oP] + (1 + Ch)K\Yt^l['^\^ and, thus, by Gron- 

wall's lemma, supo<fc<7vE|y(^'^'^P < CA^{So)- This upper bound combined 
with (18), (15) and (12) finally provides the required uniform estimates for 
E|y,f'*'^|2 and E\Z^^f^\^: 

(19) supsup sup (E|y,f'*'^|V/iE|Z;^'^'^|2)<C7^^(5o). 

/>1 i>0 0<k<N ' 

Step 3: upper bounds for rj^'^ = E|yj^'^'^ — Yf^l"^. Note that r/^'^ = 0. 
Our purpose is to prove the following relation for <k < N: 

,?f'^<(l + C%K + C/i2^-U^(5o) 

(20) 

1=1 
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Note that the estimate on maxo<fc<ArE|y^^'^'^ — Y^^]"^ given in Theorem 2 
directly fohows from the relation above. With the arguments used to de- 
rive (15) and using the estimate (19), we easily get 

r?f < Ch''-W{So) + (1 + h)E\Y,^'^'' - Y,^f 

(21) = Ch'^'WiSo) + (1 + h)E\n,,jY,^)\' 

+ (l + /.)E|y,f'°^'^-7',„,,(y,f)|^ 

where we used at the last equality the orthogonality property relative to 

(22) E|y,f - Y/^f = mPoAY.^f + nYtT'' - V,,jY,^)f. 
Furthermore, with the same techniques as for (12) and (16), we can prove 

= ^E|7^,,,,(z,^J|VEE|<L'''-7'p,.(^^l)]r 

1=1 1=1 



(23) 



N \\2 



and 



1=1 



nY,^'^''-V,,,jY,^)\' 



(24) <(l + 7ME|EfeKf;^'^-y,f 



+ ch(h+ [E|y,f-'^ - p + Eiz,^'^ - z^f] 



7 



Replacing the estimate (23) in (24), choosing 7 = Cd and using (22) directly 
leads to 



{i-ch)m^^^''-v,,jY, 



tk 



(25) <(l + C/i)7?f4 



+ C/.^ E|7^,,, (Z,^ J P + C/^E|7^,„_, (y,f ) 1^ 
1=1 

Plugging this estimate into (21) completes the proof of (20). 
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Step 4: upper bounds for = hJ2^=Q^\Z^'^'^ — Z^\^ . We aim at show- 
ing 

N-l q 

e < ch^'~'A^{So) +chY: Emp^dzHtX 

k=0 1=1 

(26) 

+ E|7^,,,(y,f)|Vc^^max_^r?f'^. 

fc=0 

In view of (23), we have 

7V-1 q 



k=0 1=1 

N-l 

.N,I,I ^N^2 TC^rin. ,vN,I,I 



k=0 

Owing to (21) and (24), we obtain 



'o) 



< Ch^'-W{So 



+ Cm,,jY,^)f + [(1 + h){l + jh) - l]E|E,[y,f;('^ - Y.^jf 



7/ 

Taking 7 = ACd and h smah enough such that dC{h + ^) < ^, we have 
proved 

N-l q N-l 

k=0 1=1 k=0 

N-l 

- - fc=0 

But taking into account (22) and (25) to estimate 'E\Y^'^'°°'^ — Y/^]"^, we 
clearly obtain (26). This easily completes the proof of Theorem 2. □ 

5. Approximation results: step 3. This step is very analogous to step 
2, except that in the sequence of iterative least squares problems (9), the 
expectation E is replaced by an empirical mean built on M independent 
simulations of {Pi^)o<k<N, {'^Wk)o<k<N-i- This leads to the algorithm that 
is presented at Section 2.3. In this procedure, some truncation functions 
and pf^™ are used and we have to specify them now. 
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These truncations come from a priori estimates on Y^^'*'^, Z^^^'^ and it 

is useful to force their simulation-based evaluations y^'^''^'^'"^ , Z^^^'^'^^'^ 
to satisfy the same estimates. These a priori estimates are given by the 
following result (which is proved later). 

Proposition 2. Under (H1)-(H3), for some constant Cq large enough, 
the sequence of functions (/of^(-) = max(l, Co|p/,fc(-)|) :0<Z<g, 0<A;<A^ — 
1) is such that 

/or any i > 0, / > and < A; < iV - 1. 

With the notation of Section 2, the definition of the (random) truncation 
functions pf'^ (resp. p^^) follows. Note that they are such that: 

• they leave invariant • po,fc = ^t^'*'^ if / = or Vha]'!, ■ pi^k = VhZj^^'^'^ 
if / > 1 (resp. a^' -p^,^ if 1 = or Vhali ■ pf^^^ if / > 1); 

• they are bounded by 2p{'^,{Pll) [resp. 2pf,^{P^^'"')]- 

• their first derivative is bounded by 1; 

• their second derivative is uniformly bounded in N,l,k,m. 

Now, we aim at quantifying the error between iY^'^'^'^^ , y/hZ^^I^'^'^^)i^k and 
(1^^'^'^, \/hZ^^I_'^)i^k, in terms of the number of simulations M, the function 
bases and the time step h. The analysis here is more involved than in [6] since 
all the regression operators are correlated by the same set of simulated paths. 
To obtain more tractable theoretical estimates, we shall assume that each 
function basis pi^k is orthonormal. Of course, this hypothesis does not affect 
the numerical scheme, since the projection on a function basis is unchanged 
by any linear transformation of the basis. Moreover, we define the event 

Af = {V j G {A:, . . . , iV - 1} : WV^^ - Id|| < h, ||Po^j - Id|| < h 

(27) 

and ||P,^/ - Id|| < 1 for 1 < Z < g} 

(see the notation of Section 2 for the definition of the matrices V-^'^ and 
Pi^)- Under the orthonormality assumption for each basis pi^k^ the matrices 
(^fc^)o<fc<Af-i; {Pt,k)o<i<qfi<k<N-i converge to the identity with probabil- 
ity 1 as M — > oo. Thus, we have limjv/^oo ^(A^^) = 1. We now state our 
main result about the influence of the number of simulations. 

Theorem 3. Assume (H1)-(H3), / > 3, that each function basis pi^k is 
orthonormal and that E|p/^fc|^ < oo for any k,l. For h small enough, we have. 
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for any < k < N — 1, 



N-l 



j=k 



N~l N-l 



< 9 J2 E{\pfiPlJ)\\j^M^.) + Ch'-' 5: [1 + |5oP +E|pf )|^] 

j=k j=k 



c 



+ j^j:[nv,v*-id\\iE\pf{p, 



Nfr>N-^^2 



j=k 



+E{\v,\-'\po,^,mpi{Pi;)f 



v,\'{l + \S^^f + \po,j\'E\p'^j{P, 



2ti?I „N I r)N\i2 



The term with [A^^]'^ readily converges to as M — > cxd, but we have 
not made estimations more explicit because the derivation of an optimal 
upper bound essentially depends on extra moment assumptions that may 
be available. For instance, if pf (P/^) has moments of order higher than 2, 



we are reduced via Holder inequality to estimate the probability P([A^ ]^) < 

^j=fc L- \ll 'J 



Efjk'iniK' -id\\ > /.)+p(iipj|-^-id|i > -Mil > 1)]. We 

havePdiyfc^'^-Idll > h) < /i-2E||T4*'^-Id||2 < /i-2E||T4*'^-Id||^ = {Mh'^)-^E\\vkvl- 
Id|||.. This simple calculus illustrates the possible computations, other terms 
can be handled analogously. 

The previous theorem is really informative since it provides a nonasymp- 
totic error estimation. With Theorems 1 and 2, it enables to see how to 
optimally choose the time step h, the function bases and the number of sim- 
ulations to achieve a given accuracy. We do not report this analysis which 
seems to be hard to derive for general function bases. This will be addressed 
in further researches [19]. However, our next numerical experiments give an 
idea of this optimal choice. 

We conclude our theoretical analysis by stating a central limit theorem 
on the coefficients a),'^'^ as M goes to oo. This is less informative than 
Theorem 3 since this is an asymptotic result. Thus, we remain vague about 
the asymptotic variance. Explicit expressions can be derived from the proof. 



Theorem 4. Assume (H1)-(H3), that the driver is continuously dif- 
ferentiahle w.r.t. {y,z) with a bounded and uniformly Holder continuous 
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derivatives and that E|p/^fcp"^^ < oo for any k,l (e > 0). Then, the vector 
[V'M(a^'^'^^ — ajl^)]i<j^k<N-i weakly converges to a centered Gaussian vec- 
tor as M goes to oo. 



Proof of Proposition 2. In view of Proposition 1, it is tempting to 
apply a Markov property argument and to assert that Proposition 2 results 
from (19) written with conditional expectations E^. But this argumenta- 
tion fails because the law used for the projection is not the conditional law 
Efc but Eq. The right argument may be the following one. Write = 
ot^yPo^k{Pli)- On the one hand, by (19), we have CA^{Sq) > E|ytf'''^P = 
"oi -^Ebo.fcPS.fcl^o'fe ^ l« o',lPAmin(E[po,fcp* fc]). On the other hand, \y/^'''^\ < 
I «oi I Wk {Pl^ ) I < \Po,k I ^CA^{So)/X^U^\po,kPlk])- Thus, we can take (x) = 
max(l,|po,fc(x)|y'C^^(5o)/A^in(Ebo,fcPS,J))- Analogously, for Vh\Z^tf\, 
we have pf^^{x) = max(l, \pi^k{x)\^CA^ (Sq)/ X^in{E\pi^kPlk]))- Note that if 
Pl^k is an orthonormal function basis, we have XminO^\pi,kP^ k\) ~ ^ P^^' 
vious upper bounds have simpler expressions. □ 



Proof of Theorem 3. In the sequel, set 

^k'"" = ^ E KkiKnf, = ^ E i/r(o, . . . , o)p. 

m=l m=l 

Obviously, we have E(^f'*^) = E|/9^fc(P,^)|2 and E(i3f '^) < C(l + |5o|2). 
Now, we remind the standard contraction property in the case of least 
squares problems in M*^, analogously to the case L2(0,P). Consider a se- 
quence of real numbers (a;™)i<m<M and a sequence {v'^)i<rn<M of vec- 
tors in R", associated to the matrix V'^ = ■jjY^m=i'^^[^'^Y which is sup- 
posed to be invertible [Aniin(^^^) > 0]. Then, the (unique) R"-valued vector 



: arginfg \ x — 9 ■ v\\, is given by 



IM 

[V^ 

\/f 

m=l 

The application 6x is linear and, moreover, we have the inequality 



(28) 0. = ^HC^ E --^^ 



Mi-l M 

M 



(29) XmUV^')m^ < \0, ■ v\li < \x\Ij. 



For the further computations, it is more convenient to deal with 

1,1,1 



taiJM\* I i,I,M* rr i,I,M* rr i,I,M*-s 
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instead of a^^'^ . Then, the Picard iterations given in (4) can be rewritten 
, M 

(30)0;. ' ' = argmf — }^ (/3o,fc+i(«o,fcVi "^'o.fc+i) + ^/fc ("fc ) - ^' • «fc ) • 

^ m=l 

Introducing the event A^^, taking into account the Lipschitz property of the 
functions pff^ and using the orthonormaUty of pi^k, we get 



rNJJ 



1^ 1^ 



■J ' 

j=k 

N-1 

(31) <9EE(|pf(P,f)ri[A-]0 

j=k 

N-1 q 

, 117/-, I I.I,M I,I\2\ , t \ " V^TT7/-i I I,I,M I,I\2\ 

j=k 1=1 

To obtain Theorem 3, we estimate 1^^'^'^'^ — ^^'^P on the event A^^. This is 



achieved in several steps. 

Step 1: contraction pre 
are summed up in the following lemma: 



Step 1: contraction properties relative to the sequence {&k^''^^)i>o- They 



Lemma 1. For h small enough, on A^ the following properties hold: 

(a) |0^+i'^'*^ - ei''^'\^ < ch\ei''^' - 

(b) There is a unique vector 0'^'^'^'^ such that 



, M 

ooJM ■ r ^ sr^ / '•N,m j I,I,M m \ . i, rm^ oo,7,Mn a m\2 

= argmf — }^ (Po,fc+iK,fcVi "^'o.fc+i) + hfk{ak ) " ^ • ^'fc ) ■ 

^ m=l 



'k 



(c) We have \e^'''^' - ei'''^'\^ < [ChY\e^'''^'\\ 

Proof. We prove (a). Since l-h< Amin(Vfc*0 and Amax(i^/^) < 2 (0 < 

Z < g) on Af , in view of (29), we obtain that (1 - h)\9i^^'^'^' - 6'/'^^]^ is 
bounded by 



fi 



2 M 



Uk ("fc )-fkio^k )) 

m=l 



<Ch ^^la^^fc -a;;. I Xm,,APi,k, 

1=0 



<ch\9'/'^'-el-'''''''\\ 
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Now, statements (a) and (b) are clear. For (c), apply (a), reminding that 

Step 2: bounds for |^^'^'^^| on the event A^^. Namely, we aim at showing 
that 

(32) le,; ' I < C7(^^;^ + /i^^ ' ) onAf. 

We first consider « = oo. As in the proof of Lemma 1, we get 

1 M 

^ '^r^\^N,m I I,I,M m \ , u frnr oo,7,Mn-i2 



M ^ 

m=l 



< (1 + 7/^)-4S^ + Ch[h + i) [b^'^ + ± K'^'PA^IP,^) j . 

Take 7 = 8C and h small enough to ensure 2C{h + i)(l + /i) < i(l - /i). It 
readily follows l^'^'^^P < C(<4.f + /iSf '^0, 

proving that (32) holds for 
i = 00. Lemma 1(c) leads to expected bounds for other values of i. 

Step 3: we remind bounds for 9^'^ . Using Proposition 2 and in view of 
(10)-(14), we have, for i > 1, 

\eli\' <npf:,iPi:)\\ 0<l<q; 

(33) ... 

Remember also the following expression of derived from (10)-(13) and 
the orthonormality of each basis pi^k- 

(34) 9^^' = E{v,[aii^, ' Po,k+i + hf,{a^'')]). 

Step 4: decomposition of the quantity E(1^m|0^'''^'''^^ — 6*^'^^). Due to 
Lemma 1, on Af we get \9^''''' - ^^'^'^p < Ch'\9^'''^'\^ < Ch'\9^''\^ + 
^^/|^oo,/,Af_^oo,/|2_ ^^^g^ ^g-j^g ^33^^ readily follows that E{1j^m\91'^'^'^ - 

^fc'^P) is bounded by 

(i + Me(i^m|C''*'-C?) 



1 



nI,I,M /)00,/,M|2\ I \qI,I /)00,/|2i 



(35) +2[i+-){m^M\e'/^'-om')+K --k 1/ 

00,/, /^oo,/|2^ I r^ul-lv^\ „N I tdN\\2 



< (1 + c/i)e(1am|0^'^'^^'-' - 0^'^ 1^) + ch'-'np';^ 



tk 
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taking account that / > 3. On A^''^, V/^'^ is invertible and we can set 

i?i = (id-(y*0-')C'> 



B2 = m 



M\-l 



, M 

EK-Po,fc+iK,fc+i ■Po,k+i)) -j^l^Vk PoJk+Molk+i ■Po,k+i) 



m=l 



-. M 



oo,I\ 



m=l 



Ba 



M 



Em\'-N,m / /,/ m \ ^N,m / 1 ,1 ,M m 

^fc [Po,fc+lV"o,fe+l "Po.fc+lJ ~ Po,k+l\^0,k+l "Po.fc+i 



+M/r(«r')-/r(«r''''))]- 



Thus, by (28)-(34), we can write 6*^'^ - O"^'^'^^^ = B1 + B2 + B3 + B4, which 
gives on A^^ 

(36) ic' - ^ 3(1 + + + + (1 + /i)i54i'. 



S'iep 5: individual estimation of Bi, B2, B^, B4 on A^^. Remember 
the classic result [16]: if ||Id - F|| < 1, F^^ - Id = Efclipd - F]^ and ||Id - 
II < T^f^^- Consequently, for F = Vif", we get E{1j^m ||Id- (T/*'^)"! f) < 

(1 - h)-'E\\id - vi'r < (1 - h)-'E\\vi'^ - u\\i = (M(i - hrr'EWvkvi - 

IdW'jp. Thus, we have 

C. 



n\Bi\'lj,M) < ^E\\vkvl - Id|||E|pf 

A/\-l| 



Since on A^ one has ||(V/0" II < 2, it readily follows 
E(|S2plA-) < ^E{\vk\'\po,k+impUPl'j\', 



Em\H^M)<^E 



+ -,i:M'np^,k{pr'' 



h 



1=1 



As in the proof of Lemma 1 and using ||-P(f{_|_i || < 1 + /i on A^, we easily 
obtain 



(1 - h)\B4\^ < (1 + h)(l + 7/i)l«oi+i - 4i 



+1I 



+ch[h+-]Y, 



I 00,/ oo,/.Af|2 

\ai u — a ' ' 



1=0 
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Step Q: final estimations. Put = E||'(;fc?;^-Id|||E|;of (Pt^)|2 + E(|t;fc|2|po fc+iP)]E|/9^fc(Pi^)P + 

the above estimates on -Bi, i?2) ^4 into (36), choose 7 = 3C and h close 
to to ensure C/i + ^ < ^; after simphfications, we get 

But in view of Lemma 1(c) and estimates (32)-(33), we have 

^(lAf l"0,fc+l -"0,fc+ll ) 

< (1 + h)W.{l^M\a'^^l^^^ - a[;^;'if P) 

Finally, we have proved 
IE(lAflC'''^-Cl') 

- ^/S^ + + + + E|p^,^.2«jr) 

+ (l + CME(l^M|a~/;f-ao%|2). 

Using a contraction argument as in (35), the index 00 can be replaced by 
/, without changing the inequality (with a possibly different constant C). 
This can be written 

117/-, I I,I,M I,I\2\ I tY^TE^/-! I I.I.M I,I\2\ 

^(lAfl"0,V -"oifcl ) + ^Z^^(lAf -"z/fcl ) 

1=1 

+ (1 + C7/i)E(lAM|a^;^:ff - Ooi+iP). 
Using Gronwall's lemma, the proof is complete. □ 

Remark 1. The attentive reader may have noted that powers of h are 
smaller here than in Theorem 2, which leads to take I > 3 instead of / > 2 
before. Indeed, we cannot take advantage of conditional expectations on the 
simulations as we did in (12), for instance. 

Note that in the proof above^ we only use the Lipschitz property of the 
truncation functions and ^ . 

Proof of Theorem 4. The arguments are standard and there are 
essentially notational difficulties. The first partial derivatives of / w.r.t. y 
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and zi are, respectively, denoted d^f and dif . The parameter (5 G ]0, 1] stands 
for their Holder continuity index. Suppose w.l.o.g. that e < (3 and that each 
function basis pi^^ is orthonormal. For A; < — 1, define the quantities 

, M 

m=l 
, M 



m=l 



Cf(a) 



For A; = — 1, we set B^^ = and in C^^(a), the terms agfe+i ■ PcTfc+i ^^^^ 
'^o'fc+i "Po.fe+i have to be replaced, respectively, by ^'^{P^'"^) and ^^{PI^). 
The definitions of Afj^^a) and are still valid. For convenience, we write 
j^M ^ ^£ ^j^g (possibly vector or matrix valued) sequence {X^^)m weakly 
converges to a centered Gaussian variable, as M goes to infinity. For the con- 
vergence in probability to a constant, we denote X^^ Since simulations 
are independent, observe that the following convergences hold: 

(37) 



l^i,fcl"fc A-Dfc --^k )i<I-l,l<q,k<N-l 



— {C^\a^ )^^k)i<I-l,l<q,k<N-l 



Note that limA^_>oo == Id is invertible. Linearizing the functions / and 
™ expressions of 0^^ = E(t;fe[aJ'^+;^ -po^k+i + hfkiaQ^k^ , %~k'^)]) 



k 

g,k 



and 0^'-^'^ given by (28) leads to 



(38) - /^E^"i(«r''')^«.''''"' - o^') 

1=0 

^1 "-^1 IJM 1,1 |2 \ " I mil m 

< lfc<7V-i^|ao,fc+i - "Cfc+ll 2^ \^k \\P0,k+l 
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M 



c 



To get Theorem 4, we prove by induction on k that ([a/M(^*'^'^^ — 6'*'^)]j>fc,i</, G^^) ^ 
Remember that 0^'^'^^ = 0^'^ = for any j. Consider first k = N — 1, for 

which = 0, and i = 1. In view of (37)-(38), clearly {[VM{9%^i^ -9%^_^)],<i,g^') - 
For 1 = 2, we may invoke the same argument using (37)-(38) and obtain 
{[VM{d]^^l^ — 9^N^i)]i<2,G^^) provided that the upper bound in (38) con- 
verge to in probability. To prove this, put M^^ = M"^"'^/'^ x J2m=i l^N-il \Pn-i I 
and write ^l^'jf - a]^W^^Ef^=i x \Pn-i\'^^ = \VM{a]^'lf - 

a]^^_i)\^~^^A4^^ . Since [^/M {a]^^J_i^ — a]v^i)]M is tight, our assertion holds if 

converges to as M ^ oo. Note that \vn-i\\pn-i\'^~^'^ G L(^2+e)/{2+i3){^)- 
Thus, the strong law of large numbers, in the case of i.i.d. random variables 
with infinite mean, leads to J2m=i l^iv-il x |Pjv-il^"^^ = 0(M(2+/3)/(2+£)+r) 
a.s. for any r > 0. Consequently, from the choice of r small enough, it follows 
A^^^^Oa.s. 

Iterating this argumentation readily leads to ([\/M {O^fJ^^ — ^]{/_i)]i<i, ^*'^) 
For the induction for k < N — 1, we apply the techniques above. There is an 
additional contribution due to B^^ , which can be handled as before. □ 

6. Numerical experiments. 

6.1. Lipschitz property of the solution under (H4). To use the algorithm, 
we need to specify the basis functions that we choose at each time tk and for 
this, the knowledge of the regularity of the functions (•) and ^;/\,(-) from 
Proposition 1 is useful (in view of Theorem 2). In all the cases described 
in Section 2.4 and below, assumption (H4) is fulfilled. Under this extra 
assumption, we now establish that y^ (•) and ^/^(•) are Lipschitz continuous. 

Proposition 3. Assume (II1)-(II4). Fork small enough, we have 

(39) |y,^(x) - 4(x')| + Vh\zg{x) - z,^(x')| <C\x- x'\ 

uniformly in kQ < N — 1. 

Proof. As for (17), we can obtain 

jg|yAf,A:o,x _ y7V,feo,a;'|2 
' ^k ' 

-l-C/i(/i + l/7) ' *^+^ ^' 
, Ch{h + lh) 

+ l-CM/i + l/7) ' ~ ' 
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, C{h + lh) 
1 - C/i(/i + 1/7) ^ ' ~ ' 

-]E|E,(y,f;^^-y,f;^-')|2). 

Choosing 7 = C and h small enough, we get (for another constant C) 

< (1 + Ch)E\Y,^ - + ChE|5f - 5f 

The last term above is bounded by C\x — under assumption (HI). Thus, 
using Gronwall's lemma and assumption (H4), we get the result for y^(-)- 
The result for \//iz^(-) follows by considering (5). □ 

6.2. Choice of function bases. Now, we specify several choices of function 



Hypercuhes (HC in the following). Here, to simplify, pi^k does not de- 

' 



bases. We denote d! (> d) the dimension of the state space of {Ptl)k- 

!re, to simphfy, pi^k 
pend on I or k. Choose a domain D <ZW^ centered on Pq , that is, D 



HiLil-Poi — P-iPoi + P]^ partition it into small hypercubes of edge 5. 
Thus, D = U„...,-^, Al,...,^,, where Ai,...,*,, =]P^^i- R + ii5,P^^^- R+ + 
1)5] X • • • x]Pq^^, -R + id'5, P^^, -R + {id' + 1)5] • Then we define pi^k as the in- 
dicator functions associated to this set of hypercubes: fc(-) = (Idj. , (■))«! 
With this particular choice of function bases, we can explicit the projection 
error of Theorem 2: 

<E{\Y,^\Hn^{Pl'j)+ E(lA^....,^,(P,^)|yf«)-yf(x.„...,J|^ 

<C6' + n\Y,^\Hn4Pil)), 
where is an arbitrary point of Di^^ i^, and where we have used the 

tfe 



Lipschitz property of on D. To evaluate E(|y^^pl£)c(Pj^)), note that, 
by adapting the proof of Proposition 3, we have < ^^(1 + 15*^1^ + 

EfclP^^P). Thus,ifsupfc^^E|Pt^|" < cxDfor a > 2, we have E(|y;,^|2lBc(P^^)) < 
^1-2 1 with an explicit constant Cq. The choice R ~ /i^2/(a-2) S = h leads 
to 

The same estimates hold for E|7?.p; ^(\//iZ/^^)p. Thus, we obtain the same 
accuracy as in Theorem 1. 

Voronoi partition (VP). Here, we consider again a basis of indicator func- 
tions and the same basis for aW <l < q. This time, the sets of the indicator 
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functions are an open Voronoi partition ([17]) whose centers are indepen- 
dent simulations of P^. More precisely, if we want a basis of 20 indicator 
functions, we simulate 20 extra paths of , denoted {P^''^'^~^^)i<i<20, in- 
dependently of {P^'"^)i<m<M ■ Then we take at time (-Pt^'*^^*)i<i<20 to 
define our Voronoi partition {Ck^i)i<i<20, where Ck^i = {x:\x — P^^'*^"*"*! < 

infjyi |x — Thenp/^fc(-) = {Ic^. We can notice that, unlike the 
hypercubes basis, the function basis changes with k. We can also estimate 
the projection error of Theorem 2, using results on random quantization and 
refer to [17] for explicit calculations. 

In addition, we can consider on each Voronoi cells local polynomials of 
low degree. For example, we can take a local polynomial basis consisting 
of l,a;i, ■ ■ ■ ,Xd' for po,/c and 1 for pi^k (l > 1) on each Ck^i- Thus, Po,fc(2;) = 
(lCfe,.(2;),a;ilCfe.,(a;), . • . ,Xd>lCkXx))i and pi^kix) = (lcfe,,(2;))i, l<l<q.We 
denote this particular choice VP(1,0), where 1 (resp. 0) stands for the max- 
imal degree of local polynomial basis for po,fc (resp. pi^k: 1 ^ ^ !^ 

Global polynomials (GP). Here we define po,fc as the polynomial (of d' 
variables) basis of degree less than dy and pij^ as the polynomial basis of 
degree less than dz- 

6.3. Numerical results. After the description of possible basis functions, 
we test the algorithm on several examples. For each example and each choice 
of function basis, we launch the algorithm for different values of M, the 
number of Monte Carlo simulations. More precisely, for each value of M, 
we launch 50 times the algorithm and collect each time the value 

The set of collected values is denoted (^t^/'^'*^)i<i<50- Then, we compute 

the empirical mean y^'^'^'*^ = ^ Y^i=i Y^^'/'^'^^ and the empirical standard 

deviation cr^^' = y ^2_^i^i\Y^^^l ~^to I - These two statistics 
provide an insight into the accuracy of the method. 

6.3.1. Call option with different interest rates 4- We follow the nota- 
tion of Section 2.4 considering a one-dimensional Black-Scholes model, with 
parameters 



fi a 


r R 


T 


5o 


K 


0.06 0.2 


0.04 0.06 


0.5 


100 


100 



Here K is the strike of the call option: $(S) = {St — K)+. We know by 
the comparison theorem for BSDEs [12] and properties of the price and 
replicating strategies of a call option, that the seller of the option has always 
to borrow money to replicate the option in continuous time. Thus, Yq is given 
by the Black-Scholes formula evaluated with interest rate R : Yq = 7.15. This 
is a good test for our algorithm because the driver / is nonlinear, but we 
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nevertheless know the true value of Yq. We test the function basis HC for 
different values of N, D and 6. Results {Y^^' ' ' and cr^'^'^'*^ in parenthesis) 

are reported in Table 1, for different values of M. Clearly, ^^'^'^'^^ converges 
toward 7.15, which is exactly the Black-Scholes price Yq calculated with 
interest rate R. We observe a decrease of the empirical standard deviation 
like 1/\/M, which is coherent with Theorem 4. 

6.3.2. Calls combination with different interest rates. We take the same 
driver / but change the terminal condition: $(S) = {St — -f^i)"'' — 2(5"^ — 
K2)~^ . We take the following values for the parameters: 



fx a r 


R 


T 


So 




K2 


0.05 0.2 0.01 


0.06 


0.25 


100 


95 


105 



We denote by BSi{r) the Black-Scholes price evaluated with strike A', and 

interest rate r. If we try to predict Yq by a linear combination of Black- 
Scholes prices, we get 

BSi{R)-2BS2{R) 2:75 

BSi{r)-2BS2{r) 2.76 

BSi{r)-2BS2{R) 1.92 

BSilR)-2BS2{r) 3.60 

Using comparison results, one can check that the first three rows provide a 
lower bound for Yq, while the fourth row gives an upper bound. According 

to the results of HC and VP, Y^.^ ' ' seems to converge toward 2.95. This 
value is not predicted by a linear combination of Black-Scholes prices: in 
this example, the nonlinearity of / has a real impact on Yq. The financial 
interpretation is that the option seller has alternatively to borrow and to 
lend money to replicate the option payoff. 

Comparing the different choices of basis functions, we can notice that the 
column = 5 of VP (Table 3) shows similar results with an equal number 
of basis functions than the column = 5 of HC (Table 2). In Table 3, the 
last two columns show that using a local polynomial basis may significantly 



Table 1 

Results for the call option using the basis HC 



M 


N = 5, D = [60, 140] ,3 = 5 


N = 10, D = [60, 140], 6 = 1 


128 


6.83(0.31) 


7.02(0.51) 


512 


7.08(0.11) 


7.12(0.21) 


2048 


7.13(0.05) 


7.14(0.07) 


8192 


7.15(0.03) 


7.15(0.03) 


32768 


7.15(0.01) 


7.15(0.02) 
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Table 2 

Results for the calls combination using the basis HC 





JV = 5 


AT = 20 


AT = 50 


M 


D = [60, 140] 


D = [60, 200] 


D = [40, 200] 




6 = 5 


6 = 1 


6 = 0.5 


128 


3.05(0.27) 


3.71(0.95) 


3.69(4.15) 


512 


2.93(0.11) 


3.14(0.16) 


3.48(0.54) 


2048 


2.92(0.05) 


3.00(0.03) 


3.08(0.12) 


8192 


2.91(0.03) 


2.96(0.02) 


2.99(0.02) 


32768 


2.90(0.01) 


2.95(0.01) 


2.96(0.01) 



Table 3 

Results for the calls combination using the bases VP and VP(1,0) 





Basis VP 


Basis VP 


Basis VP 


Basis VP(1,0) 


M 


16 Voronoi regions 


64 Vor. reg. 


10 Vor. reg. 


10 Vor. reg. 




N = 5 


AT = 20 


N = 20 


AT = 20 


128 


3.23(0.30) 


4.50(1.71) 


3.08(0.25) 


3.23(0.23) 


512 


3.05(0.13) 


3.36(0.10) 


2.91(0.11) 


3.03(0.08) 


2048 


2.94(0.06) 


3.05(0.04) 


2.90(0.06) 


2.97(0.04) 


8192 


2.92(0.03) 


2.96(0.02) 


2.86(0.03) 


2.95(0.02) 


32768 


2.90(0.02) 


2.94(0.01) 


2.86(0.02) 


2.95(0.01) 



increase the accuracy. We also remark by considering the rows M = 128, 512 
of Table 2 that the standard deviation increases with N and the number of 
basis functions, which is coherent with Theorem 3. Finally, from Table 4 the 
basis GP also succeeds in reaching the expected value, as we increase the 
number of polynomials in the basis. 

6.3.3. Asian option. The dynamics is unchanged (with d = q = 1) but 
now the interest rates are equal {r = R). The terminal condition equals 
<I>(S) = Jo Stdt — K)^ and we take the following parameters: 



fi cr 


r T 


So 


K 


0.06 0.2 


0.1 1 


100 


100 



To approximate this path-dependent terminal condition, we take d! = 2 and 
simulate P^^ = iS^,j^J2i=0'St!)* (see [18]). The results presented in Ta- 
ble 5 are coherent because the price given by the algorithm is not far from 
the reference price 7.04 given in [18]. 

As mentionned in [18], the use of -^yi-j- J2iLo 'St^ to approximate ^ /q St dt 
is far from being optimal. We can check what happens if we change P'^ to 
better approximate y Jq St dt. As proposed in [18], we approximate ^ Jq St dt 
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Table 4 

Results for the calls combination using the basis GP 





iV = 5 


N = 20 


N = 50 


N = 50 


M 


dy = 1, = 


dy = 2, = 1 


= 4, = 2 


dy =9, d:, =9 


128 


2.87(0.39) 


3.01(0.24) 


3.02(0.22) 


3.49(1.57) 


512 


2.82(0.20) 


2.94(0.12) 


2.97(0.09) 


3.02(0.1) 


2048 


2.78(0.07) 


2.92(0.07) 


2.92(0.04) 


2.97(0.03) 


8192 


2.78(0.05) 


2.92(0.04) 


2.92(0.02) 


2.96(0.01) 


32768 


2.79(0.03) 


2.91(0.02) 


2.91(0.01) 


2.95(0.01) 



Table 5 

Results for the Asian option using the basis HC 





N = 5 


N = 20 


N = 50 


M 


6 = 5 


6 = 1 


6 = 0.5 




D = [60,200]^ 


D = [60,200]^ 


D = [60,200]^ 


128 


6.33(0.41) 


4.47(3.87) 


3.48(13.08) 


512 


6.65(0.21) 


6.28(0.76) 


5.63(2.37) 


2048 


6.80(0.09) 


6.76(0.24) 


6.48(0.49) 


8192 


6.83(0.04) 


6.95(0.06) 


6.86(0.12) 


32768 


6.83(0.02) 


6.98(0.03) 


6.99(0.04) 



by ^ Eilo' S^{1 + /i| + ^AWu), which leads to P,^ = (5,^, i ^-=0 S!!il + 
+ f AVFtJ)* for k>l. The results (see Table 6) are much better with this 
choice of . Once more, we observe the coherence of the algorithm which 
takes in input simulations of under the historical probability ^ r) and 
corrects the drift to give the risk-neutral price. 

7. Conclusion. In this paper we design a new algorithm for the numer- 
ical resolution of BSDEs. At each discretization time, it combines a finite 
number of Picard iterations (3 seems to be relevant) and regressions on func- 
tion bases. These regressions are evaluated at once with one set of simulated 
paths, unlike [6], where one needs as many sets of paths as discretization 
times. We mainly focus on the theoretical justification of this scheme. We 
prove L2 estimates and a central limit theorem as the number of simulations 
goes to infinity. To confirm the accuracy of the method, we only present few 
convincing tests and we refer to [19] for a more detailed numerical analysis. 
Even if no related results have been presented here, an extension to reflected 
BSDEs is straightforward (as in [6]) and allows to deal with American op- 
tions. At last, we mention that our results prove the convergence of the 
Hedged Monte Carlo method of Bouchaud, Potters and Sestovic [5], which 
can be expressed in terms of BSDEs with a linear driver. 
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Table 6 

Results for the Asian option, using a better approximation of 
i St dt and the basis HC (N = 20, 5 = 1, D = [60, 200]^; 



M 


2 


8 


32 


128 


512 


2048 


8192 


32768 




2.26 


0.90 


4.49 


6.68 


6.15 


6.88 


6.99 


7.02 


NJ.I.M 


4.08 


7.80 


11.27 


4.64 


1.11 


0.21 


0.07 


0.02 
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